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Abstract: 

We study the phase behavior of single homopolymers in a simple hydrophobic/hydrophilic off- 
lattice model with sequence independent local interactions. The specific heat is, not unexpectedly, 
found to exhibit a pronounced peak well below the collapse temperature, signalling a possible low- 
temperature phase transition. The system size dependence at this maximum is investigated both 
with and without the local interactions, using chains with up to 50 monomers. The size dependence 
is found to be weak. The specific heat itself seems not to diverge. The homopolymer results are 
compared with those for two non-uniform sequences. Our calculations are performed using the 
methods of simulated and parallel tempering. The performances of these algorithms are discussed, 
based on careful tests for a small system. 
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1 Introduction 



The thermodynamic behavior of isolated homopolymers is known in quite some detail at and above 
their collapse temperature T — Tg, from analytical work p], H, [j| and numerical simulations of very 
long chains jl], Much less is known about the behavior at low temperatures. Consequently, it 
is of utmost interest to examine the low-T phase behavior and its model dependence, which, in 
particular, may shed light on the mechanism of folding for heteropolymers. 

The possibility of an interesting low-T phase structure for homopolymers was predicted several years 
ago P|. Recent studies R |L pi [l0| of lattice chains with stiffness jl]J have shown that such chains, 
in addition to the coil-globule transition at Tg, exhibit a phase transition from the globule phase 
to a frozen crystalline phase. Also, Zhou et al. ^) have found evidence for two or more phase 
transitions in simple off-lattice models without stiffness terms. 

In this paper, our starting point is a simple off-lattice model 113] for protein folding with two types of 
monomers, A (hydrophobic) and B (hydrophilic) . In addition to "hydrophobicity" forces, the model 
also contains sequence independent local interactions. It has previously been demonstrated [jl4| that 
it is possible to find AB sequences with compact, stable native structures in this model. Furthermore, 
it was shown that this property is crucially dependent on the local interactions. In the present paper 
we perform a detailed study of A homopolymers in this model, both with and without the local 
interactions. The most striking feature of the A homopolymers, at least for small chain lengths, is 
a pronounced peak in the specific heat, located well below Tg. This could signal that these chains, 
like lattice chains with stiffness, undergo a low-T phase transition. It is therefore interesting to look 
into the system size dependence at these temperatures, which is a computationally demanding task. 

In this paper we study the behavior at the low-T maximum of the specific heat for chains with up 
to N — 50 monomers, by using the methods of simulated (l^, |l6[ [lTj and parallel |l9[ |2(], ^l) 
tempering, which are found to be much more efficient than standard methods (see below). In this 
range of N, we find that the size dependence is very weak. There is no sign that the specific heat, 
or any other quantity, develops a singularity with increasing N. This holds both with and without 
the local interactions, although the crossover is somewhat more abrupt in the presence of these 
interactions. 

We also compare homopolymer results to those for two non-uniform AB sequences; three N = 20 
sequences, the A homopolymer and two AB heteropolymers, are studied with and without local 
interactions. We find that the specific heat and entropy display a relatively weak sequence depen- 
dence over wide ranges of low T. The dependence of these thermodynamic quantities upon the local 
interactions is, by contrast, strong. 

Our ability to simulate the low-T behavior of these chains in a controlled way relies heavily on the 
efficiency of simulated and parallel tempering. In order to assess the efficiencies of these algorithms, 
we have performed careful tests for a chain with twelve monomers. We find that the two methods 
are very similar in efficiency, and that they are much better than a standard Monte Carlo. 

The paper is organized as follows. Section || contains a brief description of the model and the two 
Monte Carlo algorithms, and a discussion of the algorithm tests. In Sec. || we present the results of 
our simulations. A summary is given in Sec. |J. 
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Sequence Tf 
~~S AAAA BBAA AABA ABAA ABBA 0.23 
U BAAA AAAB AAAA BAAB AABB < 0.15 

Table 1: Two AB sequences and their folding temperatures Tf for {ki,k 2 ) = (—1,0.5) Hj. 

2 Methods 
2.1 The Model 

The model studied jlij has two types of monomers, A (hydrophobic) and B (hydrophilic) . The 
monomers are linked by rigid bonds of unit length to form linear chains. The positions of the 
monomers will be denoted by Yi (i = 1, . . . ,N), and the (unit) bond vectors by = r^+x — r i 
(i = 1, . . . , JV — 1). The energy function is given by 

E = k 1 E 1 + k 2 E 2 +E IjJ , (1) 

where 

N-2 



E i = - E b * • b *+! > ( 2 ) 

i=l 
N-3 

E 2 = -J2 h i- b »+2 . (3) 



i=l 



N-2 N / ^ ^ 

i=l j=i+2 \ »J *J / 



(4) 



The model has three parameters, «i, K2 and . The parameter sets the depth of the minimum 
of the nonlocal Lennard- Jones interactions. It is 1 for an AA pair and 1/2 for AB and BB pairs. 
The parameters K\ and k 2 determine the strengths of the sequence independent local interactions. 
We will study the two choices (/Ci,«2) = (0,0) and (ki,k 2 ) = (—1,0.5). It has previously been 
shown that there are AB sequences with stable native structures at relatively high temperatures 
for (ki, k 2 ) = (—1, 0.5), whereas there are no, or at least extremely few, such sequences for (k\, k 2 ) = 
(0,0). 

In this paper the main focus is on A homopolymers. In addition, we study two AB heteropolymers, 
denoted by S and U, which can be found in Table [l[ The folding temperature, defined as the 
temperature where the dominance of the ground state sets in, is high for sequence S, and more 
typical for sequence U, for (k±,k 2 ) — (—1,0.5). 

Our (ki, k 2 ) = (0, 0) A homopolymers are almost identical to the chains studied by Baumgartner [^2[ 
E3| several years ago, except that the minimum of our Lennard- Jones potential is at r-y — 2 1 / 6 rather 



than 1. The model studied by Zhou et al. U2, 13 1, for fixed and flexible bond lengths, has a square- 



well potential in place of the Lennard- Jones potential. 
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2.2 Monte Carlo Methods 



Our simulations are performed using simulated |J, [l|, and parallel @, §(J || tempering. 
The basic idea of both these methods is to create an enlarged configuration space. This means that 
a number of different temperatures are studied simultaneously, which in particular can be used as 
a tool for speeding up the evolution of the system at low temperatures. 



In simulated tempering one simulates the joint probability distribution 

P(r,k)<xexp[-9k-E(r)/T k ] , (5) 

where r = {r 1; . . . , r^r} denotes the chain conformation and k is a temperature index. This distribu- 
tion contains two sets of parameters; the temperatures that the system is allowed to visit, {Tj.}, and 
a set of tunable simulation parameters, {gk}- The parameters g k are free in the sense that averages 
corresponding to the desired Boltzmann distribution can be obtained directly, without reweighting, 
independently of these. However, the gk parameters govern the marginal distribution in fc, which 
can be written as 

P(k) oc exp(- 5fc - F k /T k ) , (6) 

where Fk is the free energy at T = T k . Hence, to have good mobility in fc, it is essential to make a 
careful choice of {gk}- This is typically achieved by means of trial runs. The actual simulation of 
the distribution P{r, k) can be done by using ordinary separate updates of r and k. 



Parallel tempering uses an even larger configuration space; the simulated distribution is 

K 

P(n, ...,r K ) tx J] exp[-E(r k )/T k ] , (7) 
fc=i 

where each r k represents a complete chain conformation. The simulation is carried out by using two 
types of updates: ordinary, parallel updates of the different r k and accept/reject controlled swaps 
ffc *-* »"fc+i- The latter update is the analog of the k update in simulated tempering. Compared 
to simulated tempering, this algorithm has two advantages. First, it is very easy to parallelize; 
and second, there are no g k parameters to be tuned. These parameters are not needed since the k 
distribution is uniform by construction. 

Using these two methods, we have studied chains with 12-400 monomers. Our simulations for 
N < 32 were carried out on workstations using simulated tempering, whereas those for larger ./V 
were done on a parallel computer using parallel tempering. The temperatures T k were chosen 
according to p4[ 

(k-l)/(K-l) 



T k = T min [^ , (8) 

\ -L min / 

where T m i n = T\ and T max = Tk denote the lowest and highest allowed temperatures, respectively. 
The parameters T m ; n , T max and K can be found in Table |[ 



2.3 The Dynamics of the Algorithms 



In order to compare the performances of simulated and parallel tempering, we carried out calcu- 
lations with both methods for the N = 12 A homopolymer with (/ti,K2) = (—1,0.5). In these 
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35 








400 
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30 









Table 2: The parameters T mm , T max and if [see Eq. (||)]. 



calculations we used the same set of temperatures (see Table Q), and the same type of conformation 
update. We also performed a standard fixed-T simulation at T = T min = 0.15, using the same 
conformation update. 

In Fig. ^] we show run-time histories for the sum of all torsional angles along the chain, a, from these 
three simulations. The quantity a is convenient to study since it is known from the symmetry of 
the model that its probability distribution has to be symmetric, P{a) = P(— a). When comparing 
the different runs in Fig. [I], the horizontal axis can be thought of as CPU time, since the cost 
of the additional updates used in simulated and parallel tempering is negligible. Therefore, it is 
evident from this figure that both simulated and parallel tempering are dramatic improvements 
on the standard method. Note that the system is trapped at negative a during the whole fixed-T 
simulation, whereas the true distribution P(a) is symmetric. 

Deciding which is best of simulated and parallel tempering is a more delicate task, which requires 
a careful analysis of the statistical errors. To get accurate estimates of the errors, we performed 
two additional runs, a factor of 10 longer than those in Fig. 0. The data fr om these long runs 
were analyzed by using the jackknife method p5[ , which is chosen because the number of visits of 
a given temperature is a random variable in simulated tempering. As a result, one does not have 
an unbiased estimator of, say, a at temperature T — Tfc, but rather one estimates the ratio 

( a \ T = ^ k ■ a "> (9) 

where (•) without subscript denotes an average with respect to the joint distribution [see Eq. (0)], 
and \k is 1 if T = and otherwise. For parallel tempering the jackknife procedure is equivalent 
to a standard block analysis. 

The jackknife analysis was carried out for different bin sizes. For simulated tempering the depen- 
dence on bin size was weak. For parallel tempering, it was, by contrast, necessary to use relatively 
large bin sizes in order to get stable error estimates at low T. For this method, we also computed 
the autocorrelation function for a. Consistent with the jackknife analysis, the exponential auto- 
correlation time was estimated to be approximately 7 x 10 4 sweeps (same unit as in Fig. |l|) at 
T — T ■ 

The final statistical errors for the two methods differed by less than 10%. Thus, the algorithms 
are very similar in efficiency for this system. This is in line with the findings of Hansmann and 
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Figure 1: Evolution of the sum of all torsional angles, a, in three different simulations of the N = 12 
A homopolymer for (^1,^2) = (—1,0.5). (a) Standard Monte Carlo, (b) simulated tempering, and 
(c) parallel tempering. All data shown correspond to T = T m j n = 0.15. 



Okamoto p4| , who measured the time needed to find the ground state in an all-atom model for a 
small peptide, with similar results for simulated and parallel tempering. 



2.4 Measurements 



Two important observables in our study of the low-T behavior of these chains are the specific heat 

d(E) 1 



and the bond-bond correlation function 



N-d 

i=l 



(10) 



(11) 



which for d = 1 and 2 can be written as C&(g0 = —-Ed/ (JV — d) [see Eqs. (J2J) and @]. 
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We estimate the collapse temperature Tg by using the radius of gyration r g , defined by 




(12) 



The ratio (r 2 ,) /N tends to zero for fixed T < Tg, and to infinity for fixed T > Tg, as N — > oo. As is 
well-known, this means that Tg can be estimated from the intersections of curves corresponding to 
different N, in a plot of (r 2 g ) /N vs T. 

In our comparison of different N — 20 chains, we calculate their entropies S. The simulated- 
tempering algorithm is well suited for this purpose, since free-energy differences can be directly 
obtained from the marginal probabilities P(k) (uj. Using Eq. (||), one finds that 

rt o , , P(k) (E) k (E) l 

S k - Si=g k - gi + In — — + — — , (13) 

P\l) J-k J-l 

where Sk,i and (E)k,l denote the entropy and average energy respectively at T = T^,i. This 
expression is handy since the parameters g k are constants and the average energies are relatively 
easy to measure. The remaining term, \a.P(k)/ P{1), is just a small correction if the parameters g k 
are well chosen. 

Equation ([l3|) can be used to obtain the temperature dependence for a given chain. In order to 
compare the entropies of two different chains [different sequences and/or different (k!,k 2 )j but the 
same N], we use the umbrellalike |26| formula 

s ,_ s= (S^M +ln ^ p (__^^, (14) 

where the prime is used to distinguish the two systems. In Eq. ( |l4[ ) one could, of course, replace 
(E'Y by expectation values referring to the system without prime. 

All statistical errors quoted below are la errors, obtained by the jackknife method Ba|. 



3 Results 



We have performed simulations of the A homopolymers of the model defined in Sec. 2.1 for (ki , k%) — 
(0, 0) and (—1, 0.5). In Fig. || we show the temperature dependence of the specific heat for different 
chain lengths. For both choices of (ki, K2), it can be seen that the specific heat exhibits a pronounced 
peak at low temperatures, which could signal a phase transition. The height of the peak depends, 
however, only weakly on N. The N dependence is stronger at higher temperatures where another 
peak seems to develop. This shoulder or peak can most probably be associated with the collapse 
transition at Tg. In Sec. 3.1 we estimate Tg by using data for the radius of gyration. We then 
return to the N dependence at low temperatures in Sec. 3.2. Finally, in Sec. 3.3, we compare the 
behaviors of three different N — 20 sequences, the A homopolymer and the two AB heteropolymers 
in Table |. 



G 




Figure 2: Temperature dependence of the specific heat for A homopolymers for (a) (aci, K2) = (0, 0) 
and (b) (ki,K2) = (—1,0.5). 



3.1 Estimation of T e 



In order to estimate Xg, it is easier to use the radius of gyration than the specific heat. In Fig. |3^i 
we show the ratio {r*)/N against T for different N for (rtij/ta) = (0,0). In the temperature range 
relevant for the Tg estimation, we studied chains with up to 400 monomers (simulations are much 
easier here than at low T) . The temperature at which the curves corresponding to the sizes N and 
2N intersect, T N , was estimated to be 3.76 ± 0.11, 4.05 ± 0.06 and 4.33 ± 0.08 for N = 50, 100 and 
200, respectively. A least-square fit of these data points to the mean-field form 

m m const. , , 

T N = T e + — =- 15 



yields Tg = 4.88 ± 0.18. The systematic uncertainty in this extrapolation is difficult to estimate 
without data for larger N. Nevertheless, this analysis convincingly shows that Tg is indeed located 
well above the low-T maximum of the specific heat. 

In Fig. ||b we show the same plot for (ki,«2) = (—1,0.5). All data are for N < 50. They are very 
similar to those obtained for (k\, k%) = (0,0) at these N. In particular, it is clear that the collapse 
transition occurs well above the low-T maximum of the specific heat in this case too. 



3.2 The Low-T Behavior 



We now turn to the behavior around the low-T maximum of the specific heat. Figure |4j, which is 
an enlargement of Fig. |^, shows the specific heat itself in the vicinity of the maximum. A linear 
scaling with N of the height of the peak in C v /N would have implied a first-order phase transition. 
The data show a weak N dependence. The peak grows very slowly with N for (ki, kj) = (—1, 0.5), 
and not at all for (^1,^2) = (0,0). In neither case is there any sign that C v /N diverges. We also 
studied the energy distributions at the maxima. Consistent with the specific heat analysis, these 
were found to have a single peak, without any trace of bimodality. 
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Figure 3: Temperature dependence of (r 2 )/N for A homopolymers for (a) (ki,K2) 
(k u k 2 ) = (-1,0.5). 
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Figure 4: Low-T behavior of the specific heat for A homopolymers for (a) (ki, K2) = (0, 0) and (b) 
(ki,k 2 ) = (-1,0.5). 



The size of the chains, as measured by the radius of gyration, changes very little around the low- 
T maximum of the specific heat (see Fig. ||). The size is similar for the two choices of (ki,^), 
and very small at these temperatures. The bond-bond correlation Cb(d) [see Eq. ([ll])] provides 
information about the local structure of the chains. Unlike the radius of gyration, this quantity is 
strongly («i, K2) dependent. This can be seen from Fig. |[ which shows C&(d) against T for d = 2. 
The behavior of Cb(2) for (ki,K2) — (—1,0.5) shows that strong regularities develop in the local 
structure at low T. The change with decreasing T is, however, gradual, and does not become more 
abrupt with increasing N. In the absence of local interactions, («i,/t2) = (0,0), the correlations 
Cb(d) remain weak at low T. For small N there are some irregularities at very low T, but these 
disappear with increasing N . 



Let us compare these results with findings for lattice chains. As mentioned in the introduction, it 
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Figure 5: Temperature dependence of C'b(2) for A homopolymers for (a) (/ti,K2) = (0,0) and (b) 
(Ki./ta) = (-1,0.5). 

has been found 0, ||, ^, |l^| that lattice chains with stiffness exhibit a transition to an ordered low- 
T phase. Our (ki,K2) = (—1,0.5) chains, which have a certain degree of stiffness, do show strong 
regularities in the local structure at low T, as can be seen from the behavior of the correlation Cb(d). 
Also, the peak in the specific heat is more pronounced for these chains than for the (ki, k 2 ) = (0, 0) 
chains. However, we do not find any sign that a phase transition takes place. Although the N range 
probed is comparatively small, this indicates a possible difference between lattice and off-lattice 
chains. It would be interesting to systematically study how the off-lattice results depend on the 
degree of stiffness of the chains, but this is beyond the scope of the present paper. 

The low-T behavior of off-lattice models similar to ours has been studied before. As mentioned 
earlier, Baumgartner |2^, |23| studied a model very closely related to our (ki, K2) — (0, 0) model. Here 
the low-T peak in the specific heat was associated with a probably first-order phase transition. As 
we have seen, our results provide no support for the existence of such a transition. Zhou et al. 12 , 13) 



studied a model with square-well potentials, both for fixed and flexible bond lengths. For flexible 
bond lengths, evidence in terms of bimodal energy distributions was found for a first-order low-T 
phase transition. This contrasts sharply with the behavior of our chains, which, as mentioned above, 
have unimodal energy distributions. 



3.3 Heteropolymers 



In this section we compare the thermodynamic behaviors of the N — 20 A homopolymer and the 
two N = 20 heteropolymers S and U in Table [|. Each of the three sequences is studied both for 
(ki,K2) = (0,0) and (—1,0.5), which gives us a total of six different chains. The temperatures 
studied range between 0.15 and 4. One of the chains, sequence S with (ki,k 2 ) = (—1,0.5), has a 
folding temperature Tf « 0.23 > 0.15 Q and is referred to as the stable one, whereas Tf < 0.15 
for the others. 

In Fig. ||a we show the specific heat, which has a peak at low T for all the six chains. Note that the 
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Figure 6: (a) Specific heat and (b) entropy against temperature for the following six N — 20 chains: 
A homopolymer, (ki,K2) = (0,0) (+); sequences, (ki,k 2 ) — (0,0) (x); sequence U, (ki,«2) = (0,0) 
(fancy +); A homopolymer, (ki,K2) = (—1,0.5) (o); sequence S, (ki,k 2 ) — (—1,0.5) (□); sequence 
U, («i,«a) = (-1,0.5) (o). 



peak is highest for the stable chain, and that the maximum is at T ~ 0.30 for this chain, which is 
fairly close to its folding temperature. The perhaps most interesting part of this figure is, however, 
between T ss 0.4 and T sw 0.9. Here the data for the different chains approximately collapse onto 
two curves, each corresponding to a fixed k 2 ). Thus, the dependence on sequence is weak over 
this fairly wide range of temperature. 



2.4 



which is 



We also calculated the entropies of the chains, using the method described in Sec. 
convenient since it does not require numerical integration of any observable. In Fig. |6|b we show 
AS = S — Sq, where So is chosen as the entropy of the (ki,K2) = (—1,0.5) A homopolymer at 
T = 4. Thus, the constant So is the same for all the chains. From the figure it can be seen that 
the chains with (Ki,/ta) = ( — 1,0.5) have lower entropies than those with (k\,K2) = (0,0) at low 
T. This is not unexpected since the (ki, K2) — (— 1, 0.5) chains have stiffness. What still may seem 
somewhat surprising is that the stable chain, below its folding temperature, does not have lower 
entropy than the A homopolymer with the same (ki, k 2 ). This example underlines that the entropy 
is largely determined by local features at low T. The entropy does not tell whether the structure is 
globally stable or not (which it does in lattice models). 



4 Summary 



We have studied isolated off-lattice homopolymers with K2) = (— lj 0-5)] and without [(ki, k 2 ) — 
(0, 0)] local interactions. In both cases the specific heat exhibits a peak at low T. The peak is 
most pronounced for (ki,K2) = ( — 1,0.5). Also, the local structure of the chains develops strong 
regularities at low T for this choice of k 2 ). The dependence on system size is, however, weak at 
these temperatures; our results, obtained for N < 50, show no sign that a singularity develops with 
increasing N. 
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We have also compared homopolymer results to those for two non- uniform sequences, which were 
deliberately chosen to represent different types of behavior. Nevertheless, the specific heat and 
entropy show a relatively weak sequence dependence over wide ranges of low T. The dependence of 
these thermodynamic quantities upon the local interactions is, by contrast, strong. 

Our calculations have been performed by using simulated and parallel tempering. Our tests show 
that these two methods are indeed much more efficient than standard methods in the difficult low-T 
regime. The difference in efficiency between simulated and parallel tempering was found to be small. 

Note added in proof. After this manuscript was completed, we became aware of two studies p7| p8| 
of (k 1i k 2 ) = (0,0) A homopolymers. Grassberger and Hegger |28| estimated 0.2 < 1/Tg < 0.23, 
which is in perfect agreement with our estimate. 
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